Abstract

We present a method of noise level estimation that is valid even for high noise levels. The
method makes use of the functional dependence of the coarse grained correlation entropy
$K_2(\varepsilon)$ on the threshold parameter $\varepsilon$. We show that the function
$K_2(\varepsilon)$ depends in a characteristic way on the noise standard deviation $\sigma$.
It follows that by observing $K_2(\varepsilon)$ one can estimate the noise level $\sigma$.
Although the theory has been developed for Gaussian noise added to the observed variable, we
have checked numerically that the method is also valid for a uniform noise distribution and for
the case of a Langevin equation corresponding to dynamical noise. We have verified the validity
of our method by applying it to estimate the noise level in several chaotic systems and in the
Chua electronic circuit contaminated by noise.



*Electronic address: urbanow@if.pw.edu.pl 
†Electronic address: jholyst@if.pw.edu.pl



I. INTRODUCTION 



It is common for observed data to be contaminated by noise (for a review of
methods of nonlinear time series analysis see [1, 2]). The presence of noise can substantially
affect invariant system parameters such as the dimension, entropy or Lyapunov exponents.
In fact Schreiber [3] has shown that even 2% of noise can make a dimension calculation
misleading. It follows that the assessment of the noise level can be crucial for the estimation of
system invariant parameters. Even after performing a noise reduction one is interested in
evaluating the noise level in the cleaned data. In experiments the noise is often regarded
as a measurement uncertainty, which corresponds to a random variable added to the
momentary state of the system or to the experiment outcome. This kind of noise is usually called
measurement or additive noise. Another case is noise influencing the system dynamics,
which corresponds to the Langevin equation and can be called dynamical noise.
The second case is more difficult to analyze because the noise acting at a moment $t_0$ usually
changes the trajectory for $t > t_0$. It follows that there is no clean trajectory and
an $\varepsilon$-shadowed trajectory occurs instead [4]. For real data a signal (e.g. physical
experiment data or economic data) is subject to a mixture of both kinds of noise (measurement
and dynamical).

Schreiber has developed a method of noise level estimation [3] by evaluating the influence
of noise on the correlation dimension of the investigated system. The Schreiber method is valid
for rather small Gaussian measurement noise and needs values of the embedding dimension $d$,
the embedding delay $\tau$ and the characteristic dimension $r$ spanned by the system dynamics.

Diks [5] investigated properties of the correlation integral with a Gaussian kernel in the
presence of noise. The Diks method makes use of a fitting function for correlation integrals
calculated from time series for different thresholds $\varepsilon$. The function depends on the
system variables $K_2$ (correlation entropy), $D_2$ (correlation dimension), $\sigma$ (standard
deviation of the noise) and a normalizing constant $\phi$. These four variables are estimated
using least squares fitting. The Diks method [6] is valid for a noise level up to 25% of the
signal variance and for various measurement noise distributions. The Diks method needs optimal
values of the embedding dimension $d$, the embedding delay $\tau$ and the maximal threshold
$\varepsilon_c$.

Hsu et al. [7] developed a method of noise reduction and used this method for noise
level estimation. The method exploits the local-geometric-projection principle and is useful
for various noise distributions but rather small noise levels. To use the method one needs to
choose the number of neighboring points to be considered, an appropriate number of iterations,
as well as optimal values of the parameters $d$ and $\tau$.

Oltmans et al. [8] considered the influence of noise on the probability density function
$f_n(\varepsilon)$ but they could take into account only small measurement noise. They used a
fit of $f_n(\varepsilon)$ to the corresponding function, which was found for small
$\varepsilon$. Their fitting function is similar to the probability density distribution that we
obtain from correlation integrals, $\frac{d}{d\varepsilon} DET_n(\varepsilon)$.
The method needs as input parameters the values of $d$, $\tau$ and $\varepsilon_c$.

Our method has its origin in recurrence plots (RPs) [9] and it uses RP quantities to
characterize the data. Recurrence plots were originally introduced by Eckmann [9] as a
useful graphical tool for data analysis. The plot is defined as an $N \times N$ matrix where a
dot $(i, j)$ is drawn when the distance between the $i$-th and $j$-th points satisfies
$\|\mathbf{y}_i - \mathbf{y}_j\| < \varepsilon$ ($\varepsilon$ is a given threshold). With
recurrence plots one can study data stationarity [10, 11, 12] as well as their recurrence and
deterministic properties [13, 14, 15]. The approach was also applied to parameter optimization
[16] in the local projection method of noise reduction [17]. RPs can easily be used to calculate
characteristic system parameters like the correlation entropy, which will be done in our case.
Lines of black dots parallel to the main diagonal can appear in recurrence plots and their
number can serve as a measure of determinism [9]. In our method we take into account the number
of lines $DET_n$ of length $n$ or longer for the embedding dimension $d = 1$. We use the fact
that there is a straightforward relation between $DET_n$ and the correlation integral [18].
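As an illustration of the recurrence-plot quantities used here, the following Python sketch builds the recurrence matrix of a scalar series (embedding dimension 1, so the distance reduces to an absolute difference) and lists the lengths of the diagonal lines. The function names and the toy data are illustrative assumptions and are not taken from the original implementation.

```python
def recurrence_matrix(x, eps):
    """R[i][j] is True when |x[i] - x[j]| < eps (embedding dimension 1)."""
    n = len(x)
    return [[abs(x[i] - x[j]) < eps for j in range(n)] for i in range(n)]


def diagonal_line_lengths(R):
    """Lengths of maximal runs of dots on the diagonals above the main one."""
    n = len(R)
    lengths = []
    for offset in range(1, n):           # offset 0 is the trivial main diagonal
        run = 0
        for i in range(n - offset):
            if R[i][i + offset]:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:                          # a run that reaches the matrix border
            lengths.append(run)
    return lengths


if __name__ == "__main__":
    x = [0.0, 1.0, 0.05, 1.02, 0.1, 5.0]   # toy data, chosen by hand
    print(diagonal_line_lengths(recurrence_matrix(x, 0.2)))
```

Only the upper triangle is scanned because the matrix is symmetric; each line above the diagonal has a mirror image below it.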

The crucial point of our method is the fitting of a proper function to the estimated correlation
entropy $K_2$. In fact similar considerations can be performed for the Kolmogorov-Sinai entropy
$K_1$ [19, 20, 21] using for example the approach given in [22], but in such a case a much
larger number of data is needed since the $K_1$ entropy is more sensitive to regions of the
phase space with small values of the invariant measure. The method is not too time consuming,
e.g. a calculation of the entropy for 100 different thresholds and $N = 3000$ data points needed
a few minutes [23]. Our method does not demand any input parameters like the embedding
dimension $d$ or the embedding delay $\tau$. The minimal and maximal values of the threshold
parameter $\varepsilon$ can be estimated automatically. In all considerations we use the maximum
norm to save computation time and to perform analytic expansions. It is known that in
the limit $\varepsilon \to 0$ the behavior of invariant system parameters does not depend on the
type of norm used. In our case features of the coarse grained entropy are considered and the
value of the threshold parameter $\varepsilon$ should be comparable to the noise level. It
follows that one cannot exclude that the type of norm applied affects the functional dependence
of the coarse grained entropy $K_2(\varepsilon)$ in the presence of noise of a large or medium
value.

We stress here that our method is intended for noise level estimation. The method is
not equivalent to noise filters that allow one to extract an original undisturbed signal from
noisy time series [24, 25].



II. ENTROPY ESTIMATION FOR A TIME SERIES IN THE NOISE ABSENCE 

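Let $\{x_i\}$, where $i = 1, 2, \ldots, N$, be a time series and
$\mathbf{y}_i = \{x_i, x_{i+\tau}, \ldots, x_{i+(n-1)\tau}\}$ a corresponding $n$-dimensional
vector constructed in the embedded space, where $n$ is an embedding dimension and $\tau$ is an
embedding delay. The correlation integral calculated in the embedded space $\mathbf{y}_i$ is

$$C^n(\varepsilon) = \frac{1}{N^2} \sum_{i}^{N} \sum_{j \neq i}^{N} \theta\left(\varepsilon - \|\mathbf{y}_i - \mathbf{y}_j\|\right) \tag{1}$$

where $\theta$ is the Heaviside step function. If $\|\ldots\|$ is the maximum norm the
correlation integral $C^n(\varepsilon)$ is proportional to the number $DET_n(\varepsilon)$ of
lines of length $n$ in the RP constructed from the data set $\{x_i\}$ [3]

$$C^n(\varepsilon) = \frac{1}{N^2} \sum_{i}^{N} \sum_{j \neq i}^{N} \theta(\varepsilon - |x_i - x_j|)\, \theta(\varepsilon - |x_{i+\tau} - x_{j+\tau}|) \cdots \theta(\varepsilon - |x_{i+(n-1)\tau} - x_{j+(n-1)\tau}|) = \frac{DET_n(\varepsilon)}{N^2}. \tag{2}$$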
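The product of step functions in Eq. (2) can be evaluated directly by counting pairs. The sketch below is a deliberately simple O(N²n) Python version; the function names are assumptions made for this illustration, and the step function is taken with θ(0) = 1, so that a distance equal to the threshold still counts.

```python
def det_n(x, eps, n, tau=1):
    """Count ordered pairs (i, j), i != j, whose n delayed samples all differ by at most eps."""
    m = len(x) - (n - 1) * tau            # number of complete delay vectors
    count = 0
    for i in range(m):
        for j in range(m):
            if i == j:
                continue
            # maximum norm: every coordinate difference must stay within eps
            if all(abs(x[i + k * tau] - x[j + k * tau]) <= eps for k in range(n)):
                count += 1
    return count


def correlation_integral(x, eps, n, tau=1):
    """C^n(eps) = DET_n(eps) / N^2, following the normalization of Eq. (2)."""
    return det_n(x, eps, n, tau) / len(x) ** 2


if __name__ == "__main__":
    x = [0.0, 1.0, 0.05, 1.02, 0.1, 5.0]   # toy data, chosen by hand
    for n in (1, 2, 3):
        print(n, det_n(x, 0.2, n), correlation_integral(x, 0.2, n))
```

For realistic series lengths a neighbor-search structure would be preferable; the double loop is kept here only because it mirrors the double sum.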



The correlation entropy [26, 27] can now be calculated as

$$K_2 = \lim_{\varepsilon \to 0} \lim_{n \to \infty} \ln \frac{DET_n(\varepsilon)}{DET_{n+1}(\varepsilon)} \approx -\frac{d \ln(DET_n(\varepsilon))}{dn}. \tag{3}$$

We assume that Eq. (3) is approximately valid for $n > 2$, thus

$$DET_n = DET_2\, e^{-(n-2) K_2}. \tag{4}$$

Let us introduce the following convention for line counting: if there is a line of length $n$
then it includes one line of length $n - 1$, one line of length $n - 2$, etc. Using Eq. (4)
one can easily find the average line length $\langle n \rangle$

$$\langle n \rangle = \frac{\sum_{n \geq 2} \left(DET_n + DET_{n+2} - 2\,DET_{n+1}\right) n}{\sum_{n \geq 2} \left(DET_n + DET_{n+2} - 2\,DET_{n+1}\right)}. \tag{5}$$
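The weighted average in Eq. (5) can be computed from a table of line counts. The following Python sketch assumes the counts are supplied as a dictionary mapping n to DET_n (a layout chosen only for this example) and treats missing entries as zero.

```python
def average_line_length(det):
    """Average line length of Eq. (5); det maps n -> DET_n for n >= 2."""
    numerator = 0.0
    denominator = 0.0
    for n in sorted(det):
        if n < 2:
            continue                      # lines of length 1 are neglected
        # second difference of DET_n, used in Eq. (5) as the weight of length n
        weight = det[n] + det.get(n + 2, 0.0) - 2.0 * det.get(n + 1, 0.0)
        numerator += weight * n
        denominator += weight
    return numerator / denominator


if __name__ == "__main__":
    q = 0.5                               # plays the role of exp(-K_2) in Eq. (4)
    det = {n: 1000.0 * q ** (n - 2) for n in range(2, 200)}
    print(average_line_length(det))       # close to 2 + q / (1 - q)
```

The usage example feeds in counts that decay geometrically, as Eq. (4) assumes, so the result can be compared with the closed-form value of the geometric series.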



The above formula neglects all lines of length $n = 1$. Now the entropy can be approximated as

(6)
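As a worked step, the average line length can be evaluated in closed form and inverted for the entropy. This is a sketch under two assumptions: that Eq. (4) holds exactly for all $n \geq 2$, and that the sums in Eq. (5) run over all $n \geq 2$. It is derived here only from Eqs. (4) and (5) and need not coincide in form with the expression used as Eq. (6).

```latex
% Assumption: DET_n = DET_2 q^{n-2} with q = e^{-K_2}, sums over all n >= 2.
\begin{align}
DET_n + DET_{n+2} - 2\,DET_{n+1} &= DET_2\, q^{\,n-2} (1-q)^2, \\
\langle n \rangle &= \frac{\sum_{n \ge 2} n\, q^{\,n-2}}{\sum_{n \ge 2} q^{\,n-2}}
                   = 2 + \frac{q}{1-q}, \\
K_2 = -\ln q &= \ln \frac{\langle n \rangle - 1}{\langle n \rangle - 2}
             \approx \frac{1}{\langle n \rangle}
             \quad (\langle n \rangle \gg 1).
\end{align}
```

The last line shows the expected qualitative behavior: long average lines correspond to a small entropy.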



The relation between the entropy, dimension and correlation integral is given by the well
known formula [28, 29]

$$\lim_{\varepsilon \to 0} \lim_{n \to \infty} \ln\left(\frac{1}{N^2} DET_n(\varepsilon)\right) = D_2 \ln \varepsilon - n \tau K_2, \tag{7}$$

thus the logarithm of the correlation integral is a linear function of the entropy $K_2$ and the
system dimension $D_2$. On the other hand the correlation dimension $D_2$ is independent of the
embedding dimension $d$ if the latter is large enough. We use this fact and in the next
section we will estimate the noise effect on the dimension $D_2$ as well as on the length $n$ of
the line in the RP, where the line length corresponds to the embedding dimension. At the end
we will incorporate both effects into Eq. (7) to reproduce the complete influence of noise on
the correlation integral.



III. INFLUENCE OF NOISE ON CORRELATION INTEGRAL 

Let us modify the definition of $DET_n$ in such a way that the influence of noise on the entropy
can be analytically estimated. First we change Eq. (1) to the equivalent form

$$DET_n(\varepsilon) = \sum_{i}^{N} \sum_{j \neq i}^{N} \theta\left(\sum_{k=0}^{l} \theta\left(\varepsilon - |x_{i+k} - x_{j+k}|\right) - n\right), \tag{8}$$

where $l$ is the length of the recurrence line beginning at the point $(i, j)$. Eq. (8) is valid
provided that one assumes $\theta(0) = 1$ for the Heaviside function. The function $\theta$ in
Eq. (1) is called a kernel function [30] and it can be written in a general way as
$p_\varepsilon(r)$. Now let us use the fact [30] that the kernel function can be replaced by any
monotonically decreasing function $p_\varepsilon(r)$ with a bandwidth $\varepsilon$ such that
$\lim_{r \to \infty} r^p p_\varepsilon(r) = 0$ for $\varepsilon > 0$ and any $p > 0$.
The bandwidth $\varepsilon$ of the kernel function corresponds to the threshold $\varepsilon$.
It follows that we can replace the inner $\theta(\varepsilon - r)$ function in Eq. (8) by a new
linear continuous function

$$\theta(\varepsilon - r) \to p_\varepsilon(r) = \begin{cases} 1 - r/\varepsilon & \text{for } 0 \leq r \leq \varepsilon, \\ 0 & \text{for } r > \varepsilon, \end{cases} \tag{9}$$

and simultaneously we lower the threshold in the outer $\theta$ function by the constant
$\beta$. We have checked that other choices of $\beta$ bring similar results. Now instead of
Eq. (8) we have

$$DET'_n(\varepsilon) = \sum_{i} \sum_{j \neq i} \theta\left(\sum_{k=0}^{l} \left(1 - \frac{|x_{i+k} - x_{j+k}|}{\varepsilon}\right) - \beta n\right). \tag{10}$$

We use the above expression to calculate the mean line length $\langle n \rangle$. In practice
the length of each line is calculated as the maximal value of the parameter $n$ in Eq. (10) for
which the $\theta$ function equals 1. Having $\langle n \rangle$ we calculate the system entropy
$K_2$ using Eq. (6).
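The kernel-based line length can be sketched as follows. This is one possible reading of the procedure, written under stated assumptions: the sum is taken over the consecutive pairs, starting at (i, j), that stay within the threshold, and the line length is the largest n for which the summed kernel values reach βn. The value β = 0.5 in the usage example is an arbitrary choice for illustration, and the function names are hypothetical.

```python
def linear_kernel(r, eps):
    """Linear kernel: 1 - r/eps for 0 <= r <= eps, and 0 for r > eps."""
    return 1.0 - r / eps if 0.0 <= r <= eps else 0.0


def soft_line_length(x, i, j, eps, beta):
    """Largest n such that the summed kernel values along the line reach beta * n."""
    total = 0.0
    k = 0
    while i + k < len(x) and j + k < len(x):
        r = abs(x[i + k] - x[j + k])
        if r > eps:
            break                         # the recurrence line ends here
        total += linear_kernel(r, eps)
        k += 1
    return int(total / beta)


if __name__ == "__main__":
    x = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0, 9.0]   # toy data: an exactly repeated segment
    print(soft_line_length(x, 0, 3, eps=0.5, beta=0.5))
```

Because the kernel weights close pairs more strongly than pairs near the threshold, the resulting length is no longer simply the number of dots on the line.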

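Now let us consider the influence of uncorrelated Gaussian noise $\eta_i$ added to the observed
system variable $x_i$. Eq. (10) is replaced by the following approximation

$$DET'_n(\varepsilon) = \sum_{i} \sum_{j \neq i} \theta\left(\sum_{k=0}^{l} \left(1 - \frac{|x_{i+k} + \eta_{i+k} - x_{j+k} - \eta_{j+k}|}{\varepsilon}\right) - \beta n\right)$$

$$\approx \sum_{i} \sum_{j \neq i} \theta\left(\sum_{k=0}^{l} \left(1 - \frac{|x_{i+k} - x_{j+k}|}{\varepsilon}\right) - \beta n \left[1 + \sqrt{\pi}\, \frac{\sqrt{\alpha^2 \varepsilon^2 + 2\sigma^2} - \alpha \varepsilon}{\varepsilon}\right]\right), \tag{11}$$

where $\sigma$ is the standard deviation of the noise and $\alpha$ is a constant of order 1 that
depends on the distribution of $|x_i - x_j|$. One can easily derive Eq. (11) assuming that
$\sigma_x \approx \alpha \varepsilon$, where $\sigma_x$ stands for the standard deviation of
$|x_i - x_j| \in (0, \varepsilon)$. When the differences $|x_i - x_j|$ are uniformly distributed
in the region $(0, \varepsilon)$ then $\alpha = 1/\sqrt{3}$.

Comparing Eq. (11) to Eq. (8) and Eq. (10) we see that the effect of noise corresponds
formally to the change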



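$$n \to n \left(1 + \sqrt{\pi}\, \frac{\sqrt{\varepsilon^2/3 + 2\sigma^2} - \varepsilon/\sqrt{3}}{\varepsilon}\right). \tag{12}$$

Instead of the second part of the rhs of Eq. (7) we have

$$-n \tau K_2 \to -n \tau K_2(\varepsilon) \left(1 + \sqrt{\pi}\, \frac{\sqrt{\varepsilon^2/3 + 2\sigma^2} - \varepsilon/\sqrt{3}}{\varepsilon}\right). \tag{13}$$

For a small noise ($\sigma \ll \varepsilon$) the last equation can be transformed to

$$-n \tau K_2 \to -n \tau K_2(\varepsilon) \left(1 + \sqrt{3\pi}\, \frac{\sigma^2}{\varepsilon^2}\right), \tag{14}$$

which is in agreement with the well known result [31, 32] for the noise entropy in the case of a
noise spectrum $S(\omega) \sim \omega^{-2}$

$$K_{noisy} \sim \frac{1}{\varepsilon^2}. \tag{15}$$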
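The small-noise limit can be checked by expanding the square root that appears in the correction term. The following worked step uses only the expression $\sqrt{\varepsilon^2/3 + 2\sigma^2} - \varepsilon/\sqrt{3}$ and the condition $\sigma \ll \varepsilon$; it does not depend on the numerical prefactor multiplying the term.

```latex
% First-order expansion in sigma^2 / epsilon^2 (valid for sigma << epsilon).
\begin{align}
\sqrt{\frac{\varepsilon^2}{3} + 2\sigma^2}
  &= \frac{\varepsilon}{\sqrt{3}} \sqrt{1 + \frac{6\sigma^2}{\varepsilon^2}}
   \approx \frac{\varepsilon}{\sqrt{3}} \left(1 + \frac{3\sigma^2}{\varepsilon^2}\right), \\
\frac{\sqrt{\varepsilon^2/3 + 2\sigma^2} - \varepsilon/\sqrt{3}}{\varepsilon}
  &\approx \sqrt{3}\, \frac{\sigma^2}{\varepsilon^2}.
\end{align}
```

The relative correction to the entropy term therefore scales as $\sigma^2/\varepsilon^2$, which is the $1/\varepsilon^2$ dependence quoted for the noise entropy.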



Eq. (13) expresses the influence of noise on the line length $n$. On the other hand
Schreiber has shown [3] that the influence of noise can be described by the substitution in
Eq. (7)

$$D_2 \to D_2 + (n - r)\, g\left(\frac{\varepsilon}{2\sigma}\right), \tag{16}$$

where

$$g(z) = \frac{2}{\sqrt{\pi}}\, \frac{z e^{-z^2}}{\mathrm{erf}(z)} \tag{17}$$

and the parameter $r$ follows from the method of singular value decomposition used in [3].
Combining Eq. (7) with results (13) and (16) we get

$$DET_n(\varepsilon) \sim \varepsilon^{D_2 + (n - r) g(\varepsilon/2\sigma)} \exp\left[-n \tau K_2(\varepsilon) \left(1 + \sqrt{\pi}\, \frac{\sqrt{\varepsilon^2/3 + 2\sigma^2} - \varepsilon/\sqrt{3}}{\varepsilon}\right)\right], \tag{18}$$
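The function g(z) of Eq. (17) is easy to evaluate with the error function from the Python standard library. The sketch below also shows its two limits, which explain the substitution of Eq. (16): for thresholds far below the noise level (z → 0) g tends to 1, so the apparent dimension grows with the line length, while for thresholds far above the noise level g vanishes and D_2 is recovered. The helper `noisy_dimension` is a name introduced here for illustration.

```python
import math


def g(z):
    """g(z) = 2 z exp(-z^2) / (sqrt(pi) erf(z)) for z > 0, as in Eq. (17)."""
    return 2.0 * z * math.exp(-z * z) / (math.sqrt(math.pi) * math.erf(z))


def noisy_dimension(d2, n, r, eps, sigma):
    """Effective dimension D_2 + (n - r) g(eps / (2 sigma)) of Eq. (16)."""
    return d2 + (n - r) * g(eps / (2.0 * sigma))


if __name__ == "__main__":
    for z in (1e-6, 0.5, 1.0, 2.0, 5.0):
        print(z, g(z))
    # threshold far above the noise level: the clean dimension is recovered
    print(noisy_dimension(d2=2.0, n=10, r=2, eps=1e3, sigma=1.0))
```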

where $K_2(\varepsilon)$ is the coarse grained entropy of the clean signal. The explicit form of
the function $K_2(\varepsilon)$ is unknown. A good fit that seems to be valid for several
systems is

$$K_2(\varepsilon) = \kappa + b \ln(1 - a \varepsilon), \tag{19}$$

where the constant $\kappa$ corresponds to the correlation entropy while the second term
describes the effect of the coarse graining. We stress here that the precise form of the latter
function is not needed for our approach to noise level estimation because we are left with some
free parameters. It follows that one can estimate the coarse grained entropy of the signal with
noise as

$$K_{noisy}(\varepsilon) = -\frac{1}{\tau} \frac{d \ln(DET_n(\varepsilon))}{dn} = -\frac{1}{\tau}\, g\left(\frac{\varepsilon}{2\sigma}\right) \ln \varepsilon + K_2(\varepsilon) \left(1 + \sqrt{\pi}\, \frac{\sqrt{\varepsilon^2/3 + 2\sigma^2} - \varepsilon/\sqrt{3}}{\varepsilon}\right), \tag{20}$$

where the function $g(\cdot)$ corresponds to the influence of noise on the correlation
dimension while the second term can be split into the coarse grained entropy of the clean signal
$K_2(\varepsilon)$ and the linear increase of this entropy due to the presence of the external
noise, $\sqrt{\pi} \left(\frac{\sqrt{\varepsilon^2/3 + 2\sigma^2} - \varepsilon/\sqrt{3}}{\varepsilon}\right) K_2(\varepsilon)$.
To estimate the noise level $\sigma$ one can use the above dependence of the correlation entropy
$K_{noisy}(\varepsilon)$ on the threshold $\varepsilon$. However we have found that because of
a peculiar behavior of $K_{noisy}(\varepsilon)$ it is more convenient to fit the function
$K_{noisy}(\varepsilon) \cdot \varepsilon^p$ instead of $K_{noisy}(\varepsilon)$ to the
corresponding experimental data ($p$ is a constant of order 1, see the next section for
discussion). It follows that we need to estimate five free parameters $\kappa$, $\sigma$, $a$,
$b$ and $c$ for the function



$$K_{noisy}(\varepsilon) \cdot \varepsilon^p = -c\, \varepsilon^p g\left(\frac{\varepsilon}{2\sigma}\right) \ln \varepsilon + \left(\kappa + b \ln(1 - a \varepsilon)\right) \varepsilon^p \left[1 + \sqrt{\pi}\, \frac{\sqrt{\varepsilon^2/3 + 2\sigma^2} - \varepsilon/\sqrt{3}}{\varepsilon}\right]. \tag{21}$$

The parameter $c$ ($c$ typically ranges from 0.5 to 0.7) has been introduced for a better
agreement with numerical data. To fit the above function we have used the Levenberg-Marquardt
method [33]. We stress here that we do not need to assume any input values for the above
coefficients; they appear as a result of the application of our method.
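For concreteness, the five-parameter fitting function can be written as a plain Python function. This is a sketch rather than the authors' code: the prefactor of the square-root term is taken as √π, which is an assumption about the reading of Eq. (21), and the argument names are chosen for this example. The function requires ε > 0, σ > 0 and aε < 1.

```python
import math

SQRT_PI = math.sqrt(math.pi)   # assumed prefactor of the noise correction term


def g(z):
    """g(z) = 2 z exp(-z^2) / (sqrt(pi) erf(z)), the dimension correction."""
    return 2.0 * z * math.exp(-z * z) / (SQRT_PI * math.erf(z))


def fit_function(eps, kappa, sigma, a, b, c, p=1.0):
    """Model for K_noisy(eps) * eps**p with free parameters kappa, sigma, a, b, c."""
    noise_factor = 1.0 + SQRT_PI * (
        math.sqrt(eps ** 2 / 3.0 + 2.0 * sigma ** 2) - eps / math.sqrt(3.0)
    ) / eps
    clean_entropy = kappa + b * math.log(1.0 - a * eps)
    dimension_term = -c * eps ** p * g(eps / (2.0 * sigma)) * math.log(eps)
    return dimension_term + clean_entropy * eps ** p * noise_factor


if __name__ == "__main__":
    for eps in (0.05, 0.1, 0.2, 0.4, 0.7):
        print(eps, fit_function(eps, kappa=0.3, sigma=0.05, a=0.5, b=0.2, c=0.6))
```

A function of this shape could be handed to any nonlinear least-squares routine; for instance, `scipy.optimize.curve_fit` with `method="lm"` provides a Levenberg-Marquardt fit, provided the model is first vectorized over the threshold values.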



IV. NOISE LEVEL ESTIMATION: EXAMPLES 



In practice, all input parameters of the method can be left at their default values. Owing to
the character of the method, the evaluation of the embedding dimension that usually appears in
nonlinear time series analysis is not needed at all. Since in the RP we consider lines of all
lengths larger than 2, the embedding dimension applied here is practically the highest
possible for a given time series.

The first step is to calculate the average line length $\langle n \rangle$ for a given threshold
and then to find the corresponding entropy $K_2(\varepsilon)$ using formula (6). Having values
of the entropy for about 100 different thresholds, one should rescale the $\varepsilon$-axis. In
this way different systems with different sizes of attractors can be compared. In practice we do
this by multiplying $\varepsilon$ by some constant $\gamma$ such that
$\varepsilon_{max} \cdot \gamma = \varepsilon'_{max} = 0.7$ ($\varepsilon_{max}$ has been chosen
using the condition $K_2(\varepsilon_{max}) = 0.015$). After finding the noise level $\sigma'$
in the rescaled data, the corresponding noise level of the original time series can be
calculated as $\sigma = \sigma'/\gamma$.
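The rescaling step amounts to a single multiplicative constant. A minimal Python sketch, with function names chosen for this example and the target value 0.7 taken from the text:

```python
def rescale_thresholds(eps_values, eps_max, target=0.7):
    """Return gamma = target / eps_max and the thresholds multiplied by gamma."""
    gamma = target / eps_max
    return gamma, [gamma * e for e in eps_values]


def sigma_in_original_units(sigma_rescaled, gamma):
    """Convert a noise level fitted on the rescaled axis back to the data units."""
    return sigma_rescaled / gamma


if __name__ == "__main__":
    gamma, scaled = rescale_thresholds([0.5, 1.0, 2.0], eps_max=2.0)
    print(gamma, scaled)
    print(sigma_in_original_units(0.07, gamma))
```

Here `eps_max` is assumed to have been determined beforehand from the entropy condition quoted above.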

One can see the behavior of the fitting function (21) for the clean signal in Fig. 1. For a
small threshold $\varepsilon \ll \varepsilon_{max}$ the dependence is linear since for small
$\varepsilon$ $K_2(\varepsilon)$ is constant.

FIG. 1: Chaotic Henon map without a noise. Plot of coarse grained entropy multiplied by the

FIG. 2: Chaotic Henon map with measurement noise %NTS ≈ 10%. Plot of coarse grained entropy
calculated from the time series multiplied by $\varepsilon^{0.622}$ (squares) and the fitting
function (21) with $p = 0.622$ (line).

The important feature of the plot of $K_{noisy}(\varepsilon) \cdot \varepsilon^p$ for noisy data
is the appearance of two maxima (see Fig. 2). This feature is helpful for the noise estimation
since the origins of these maxima are related to the first and second parts of the rhs of
Eq. (21), i.e. the first maximum is connected to the noise level, while the second maximum is
connected to the finiteness of the attractor. For a high noise level both maxima merge. The
position of the first peak or of the single maximum can be used for additional noise estimation
because one can find that for

$$p \approx 0.3441717 - \frac{1}{\ln \sigma} \tag{22}$$

the maximum of $K_{noisy}(\varepsilon) \cdot \varepsilon^p$ appears at $\varepsilon = \sigma$.
The relation (22) gives us a second way, besides Eq. (21), to estimate the noise level and to
check the results obtained from the fit (21).
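One way to see where a numerical constant of this kind can come from is to look at the noise-related term of the fitting function alone, f(ε) = ε^p g(ε/2σ)(−ln ε), and ask for which exponent p it is stationary at ε = σ. The Python sketch below does this; restricting attention to that single term, and working with rescaled thresholds below 1, are assumptions of the illustration, and the function names are hypothetical.

```python
import math


def g(z):
    """g(z) = 2 z exp(-z^2) / (sqrt(pi) erf(z))."""
    return 2.0 * z * math.exp(-z * z) / (math.sqrt(math.pi) * math.erf(z))


def dlog_g(z):
    """Analytic derivative d ln g / dz = 1/z - 2z - (2/sqrt(pi)) exp(-z^2) / erf(z)."""
    return 1.0 / z - 2.0 * z - 2.0 / math.sqrt(math.pi) * math.exp(-z * z) / math.erf(z)


def first_term(eps, sigma, p):
    """Noise-related term eps**p * g(eps / (2 sigma)) * (-ln eps), for 0 < eps < 1."""
    return eps ** p * g(eps / (2.0 * sigma)) * (-math.log(eps))


def exponent_for_peak_at_sigma(sigma):
    """Exponent p making first_term stationary at eps = sigma (0 < sigma < 1).

    Setting d ln f / d ln eps = p + z dlog_g(z) + 1/ln(eps) to zero at
    eps = sigma, where z = eps / (2 sigma) = 1/2.
    """
    z = 0.5
    return -z * dlog_g(z) - 1.0 / math.log(sigma)


if __name__ == "__main__":
    print(-0.5 * dlog_g(0.5))                 # the sigma-independent constant
    print(exponent_for_peak_at_sigma(0.05))   # exponent for a rescaled sigma of 0.05
```

The σ-independent part of the exponent is fixed entirely by the shape of g at z = 1/2, while the remaining part depends on the logarithm of the rescaled noise level.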

Let us define the percent of noise as the ratio of $\sigma$ to the standard deviation of data

$$\%NTS = \frac{\sigma}{\sigma_{DATA}} \cdot 100\%. \tag{23}$$

TABLE I: Results of noise level estimation for systems with the measurement noise.

System               %NTS   σ      estimated σ
Henon                0%     0      -0.0023 ± 0.0001
Henon                9%     0.1    0.1 ± 0.0007
Duffing oscillator   20%    0.4    0.46 ± 0.005
Duffing oscillator   55%    2      1.9 ± 0.02
Ikeda                10%    0.07   0.07 ± 0.0005
Lorenz               22%    2.2    2.2 ± 0.01
Roessler             4%     0.58   0.58 ± 0.012
Roessler             14%    2      1.75 ± 0.01
Roessler             35%    6      6.16 ± 0.2
Roessler             48%    10     8.94 ± 0.1


The estimated values of the standard deviation $\sigma$ obtained by an appropriate fit to
Eq. (21) for several systems and noise levels are presented in Table I. One can see a fairly
good agreement between the estimated and known levels of noise.
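The noise percentage of Eq. (23) is a one-line computation. In the sketch below the standard deviation of the data is taken as the population standard deviation of the series that is passed in; whether the population or the sample form is used is not specified in the text and is an assumption of this example.

```python
import statistics


def percent_nts(sigma_noise, data):
    """Noise level as a percentage of the standard deviation of the data."""
    return 100.0 * sigma_noise / statistics.pstdev(data)


if __name__ == "__main__":
    data = [0.0, 2.0] * 50            # toy series with unit standard deviation
    print(percent_nts(0.1, data))
```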

We also apply this method to chaotic differential equations where the noise $\eta_n$ is added
to the system states $y_n$ calculated by the fourth order Runge-Kutta algorithm. It follows that
subsequent points of the trajectory depend in a nonlinear way on previous noisy contributions
(we call this kind of noise a dynamical noise). In fact we consider noise added to
the nonlinear map resulting from the original differential equations and the Runge-Kutta
procedure, $y_{n+1} = F(y_n + \eta_n)$. We have found that the noise level estimated by our
method corresponds to the standard deviation of the noise existing in the system,
$\sigma = \sqrt{\langle \eta_n^2 \rangle}$. Fig. 3 shows that the behavior of the coarse grained
entropy is similar in the presence of dynamical and additive noise.
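The difference between the two kinds of noise can be made concrete with a short simulation. The Python sketch below perturbs the state before every fourth order Runge-Kutta step (dynamical noise) and, independently, perturbs only the recorded value (measurement noise). The Lorenz parameters (10, 28, 8/3), the step size, the initial condition, and the choice of adding noise to all three state components are assumptions made for this illustration.

```python
import random


def lorenz(y, s=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (standard parameter values assumed)."""
    x1, x2, x3 = y
    return (s * (x2 - x1), x1 * (r - x3) - x2, x1 * x2 - b * x3)


def rk4_step(f, y, h):
    """One fourth order Runge-Kutta step of size h."""
    k1 = f(y)
    k2 = f(tuple(yi + 0.5 * h * ki for yi, ki in zip(y, k1)))
    k3 = f(tuple(yi + 0.5 * h * ki for yi, ki in zip(y, k2)))
    k4 = f(tuple(yi + h * ki for yi, ki in zip(y, k3)))
    return tuple(
        yi + h / 6.0 * (a1 + 2.0 * a2 + 2.0 * a3 + a4)
        for yi, a1, a2, a3, a4 in zip(y, k1, k2, k3, k4)
    )


def simulate(n_steps, h=0.01, sigma_dyn=0.0, sigma_meas=0.0, seed=0):
    """Return the observed first coordinate under dynamical and/or measurement noise."""
    rng = random.Random(seed)
    y = (1.0, 1.0, 1.0)
    observed = []
    for _ in range(n_steps):
        # dynamical noise: the perturbed state is fed into the next step
        noisy_state = tuple(yi + rng.gauss(0.0, sigma_dyn) for yi in y)
        y = rk4_step(lorenz, noisy_state, h)
        # measurement noise: only the recorded value is perturbed
        observed.append(y[0] + rng.gauss(0.0, sigma_meas))
    return observed


if __name__ == "__main__":
    clean = simulate(1000)
    dynamical = simulate(1000, sigma_dyn=0.1)
    print(clean[-1], dynamical[-1])
```

With dynamical noise the perturbations feed back into the trajectory, so the noisy run cannot be written as the clean run plus an independent error term.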

Results for the dynamical noise and for a mixture of the two kinds of noise are presented in
Tables II and III. In Table II the first three examples correspond to noise added after writing
the value of a variable into a file and the next examples correspond to noise added just
before writing a variable to a file.

Our method can be useful for the evaluation of very high noise levels. Fig. 4 shows the plot
of the function (21) for the noise level %NTS ≈ 100% with $p = 1$. In such a case the error of
the estimation is large because we are free to use five parameters to fit a simple curve. We
have found that for high noise levels it is better to use as the fitting function a sum of the
equations (21) with different exponents $p$ (we have used $p_1 = 0.5$ and $p_2 = 7$). It follows
that we fit the function $K_{noisy}(\varepsilon) (\varepsilon^{p_1} + \varepsilon^{p_2})$. The
estimation works better because for different values of $p$ the function (21) is more sensitive
to different noise levels.

FIG. 3: Chaotic Lorenz model with the dynamical noise. Plot of coarse grained entropy calculated
from the time series multiplied by the threshold $\varepsilon$ (squares) and the fitting
function (21) with $p = 1$ (line).

FIG. 4: Noise %NTS ≈ 100%. Plot of coarse grained entropy calculated from the time series
multiplied by the threshold $\varepsilon$ (squares) and the fitting function (21) with $p = 1$
(line). Horizontal axis: rescaled threshold $\varepsilon$.

To verify our method in a real experiment we have analyzed data generated by a nonlinear
electronic circuit. The Chua circuit in the chaotic regime [36] has been used and we have added
a measurement noise to the output signal. The noise (white and Gaussian) has come from an
electronic noise generator. The results are presented in Table IV. The first two rows
correspond to $N = 10000$ and the rest to $N = 1000$. In the case of a small noise level we
cannot perform any estimation for a small number of data, because the noise is smaller than the
average distance between nearest neighbors. The estimation for $N = 1000$ has taken a few
minutes [23].

TABLE II: Results of noise level estimation for the Lorenz system with the dynamical noise.

System   %NTS   σ   estimated σ
Lorenz   11%    1   1.19 ± 0.12
Lorenz   11%    1   1.17 ± 0.15
Lorenz   11%    1   1.14 ± 0.1
Lorenz   11%    1   1.15 ± 0.2
Lorenz   11%    1   1.11 ± 0.18
Lorenz   11%    1   1.09 ± 0.14

TABLE III: Results of noise level estimation for systems with mixture of measurement and
dynamical noise.

System     %NTS   σ       estimated σ
Lorenz     43%    4.06    4.56 ± 0.12
Lorenz     56%    5.93    5.34 ± 0.11
Lorenz     35%    2.93    2.42 ± 0.12
Roessler          2.82    1.97 ± 0.12
Roessler          33.5    32 ± 0.75
Roessler          16.12   16 ± 0.71



TABLE IV: Results of noise level estimation for the Chua circuit with the measurement noise.

%NTS    σ [mV]   estimated σ [mV]
0%      0        0.15 ± 0.015
3.1%    30.4     29.6 ± 0.3
6.2%    60.8     61.3 ± 8
12.3%   121.7    116 ± 8
24.9%   243.4    223 ± 13
28.3%   304      380 ± 9
46.1%   486      499 ± 20
73.7%   973      1109 ± 52
90.6%   1520     1537 ± 17
96.5%   2120     2042 ± 38



V. CONCLUSIONS



In conclusion, we have developed a new method of noise level estimation from time
series. The method makes use of the functional dependence of the coarse grained entropy
$K_2(\varepsilon)$ on the threshold $\varepsilon$. It appears that the peculiar shape of this
entropy $K_2(\varepsilon)$ depends on the standard deviation of the noise $\sigma$, so a simple
function fitting can be applied to find the noise level. The process of noise estimation can be
done easily without assuming input parameters and can be programmed in such a way that the
algorithm performs all steps automatically. When the length of the time series is $N < 5000$
the whole evaluation procedure takes a few minutes [23]. The method has no limitations
regarding the noise level or the kind of noise, so one can evaluate very high noise levels and
dynamical noise as well. We have verified the validity of our method by applying it to estimate
the noise level in several chaotic systems and in the Chua electronic circuit.